'  RESOLUTION  TEST  CHART 

>UR£AU  OF  STANOARDS- 1963- A 


OTIC  FILF  copy  AD- A 161  964 


SCHOOL  OP  OPERATIONS  RESEARCH 
AND  INDUSTRIAL  ENGINEERING 
COLLEGE  OF  ENGINEERING 
CORNELL  UNIVERSITY 
ITHACA,  NEW  YORK  14853 


m 


TECHNICAL  REPORT  NO.  594 

October,  1983 
(Revised  September,  1984 )*> 


INITIALIZATION  EFFECTS  IN 
COMPUTER  SIMULATION  EXPERIMENTS 


Lee  Schruben 


David  Goldsman 


DT1C 

^ELECTEI 

h  DEC  0  6  1985  [ 


This  research  was  supported  by  the  Office  of  Naval  Research  under 
contract  N00014-81-K-0037.  David  Goldsman  is  currently  in  the 
School  of  Industrial  and  Systems  Engineering,  Georgia  Institute  of 
Technology,  Atlanta,  GA  30332. 


_ _  — 

^  a'*- _ 1 

**  J  _ _ •—  -'*• 


85 


•  * 


Absirac  t 


^  relating  to  i  ni  ti  al  i  2 1  ng  a  simulation  program  are 

discussed,  and  suggested  procedures  tor  handling  this  source  of  er 


are  reviewed. 


Accession  For 

"ntis  GRA&I 
DTIC  TAB 
Unannounced 
Justif lcation. 


By - 

Distribution/  __ 

Availability  Codes 
Avail  and/or 
Dlst  Special 


quality 

INSPEOTEO 


1  .  Ini  r  t if h * r  t  i  on 

When  a  simulation  program  is  run  on  a  computer,  initial  values 
•♦or  all  variables  must  be  specified.  The  experimenter  typically  does 
not  know  what  initial  values  are  appropriate  for  all  of  the  variables 
in  the  program.  These  values  are  often  selected  in  a  rather  arbitrary 
manner  as  a  matter  of  convenience;  such  a  selection  might  have  a 
significant  influence  on  the  outcome  of  the  experiment.  If  the 
setting  of  the  initial  conditions  for  a  run  has  a  major  but 
unrecognized  effect,  then  the  results  from  the  run  can  be  misleading. 
Thus,  initialization  can  be  a  serious  source  of  error  in  simulation 
experiments. 

The  simulation  literature  contains  many  papers  on  'the 
i ni ti al i zat i on  bias  problem'.  Probably  as  much  has  been  written  about 
this  topic  as  any  other  single  area  of  simulation  output  analysis,  but 
there  does  not  seem  to  be  any  definitive  solution  to  the  problem. 
Indeed,  no  universally  accepted  definition  of  the  problem  exists. 

This  is  because  there  is  not  just  a  single  issue  in  initialization  of 
simulations;  there  are  several. 


One  focus  has  been  on  obtaining  accurate  (low  bias)  and  precise 
(low  variability)  estimators  of  some  'steady  state'  measure  of 
simulated  system  performance.  Say  that  the  simulated  process  at  time 
t,  ,  has  a  distribution  function  denoted  by  .  The  problem  is  to 
estimate  properties  (e.g.,  moments,  quantiles)  of  the  limiting  (steady 
state)  distribution,  F  s  lim^^F^,.  The  difficulty  is  that  in  many 
simulations  (e.g.,  the  M/M/l  queue  with  high  traffic  intensity),  the 
limiting  distribution  is  asyaptotic.  None  of  the  data  in  the  output 


series  will  be  sampled  from  the  distribution  function  F.  This 


problem  hat.  thor  etler  jttJcs  similar  to  forecasting  problems  [see 
Kimbler,  ef  a/.  (1979)  and  Snell  <1980)3. 


Another  concern  is  that  the  effects  of  initialization  do  not  lead 
to  erroneous  experimental  conclusions.  If  the  purpose  of  the 
simulation  study  is  steady  state  system  performance  estimation,  then 
the  same  steady  state  issues  mentioned  above  may  be  pertinent. 

However,  many  simulation  models  are  used  as  aids  in  (non-steady  state) 
system  design  and  evaluation.  In  design  studies,  the  concern  is 
whether  a  better  design  could  be  possible  if  initialization  errors 
were  not  present.  Evaluation  studies  question  whether  or  not  the 
presence  of  initialization  errors  favors  other  than  the  best  of 
several  alternatives  that  are  being  examined.  In  short,  does  the 
presence  of  initialization  effects  potentially  result  in  an  incorrect 
deci sion? 

In  Section  2  of  this  paper,  some  examples  of  simulation 
initialization  effects  are  presented.  Section  3  discusses  the 
mathematical  aspects  of  initializing  a  simulation  program.  Suggested 
procedures  for  detecting  and  handling  this  source  of  error  are 
reviewed  in  Section  4. 

2.  Some  examples 

Some  simple  examples  illustrate  various  problems  that  can  arise 
when  initializing  a  simulation  program:  If  a  simulation  of  a  new 
factory  is  initiated  with  no  work  in  progress,  then  the  production  of 
finished  goods  will  be  relatively  low  early  in  the  run.  Products  have 
not  yet  had  time  to  flow  through  the  system  (and  so  there  will  be 
little  congestion  in  the  system  for  some  time).  A  naive  statistical 


analyt.it.  oi  t  hr  data  from  such  a  simulation  might  conclude  that  the 
production  it  positively  correlated  with  congestion  on  the  fattory 
•floor.  This  correlation  is  an  artifact  caused  by  the  way  in  which  the 
experiment  was  run;  the  correlation  is  not  a  natural  characteristic 
of  the  system  being  studied.  Nevertheless,  the  study  might  suggest 
that  one  way  to  increase  production  is  to  make  the  factory  more 
congested.  Low  observed  machine  utilizations  (perhaps  again  due  to 
the  simulated  factory  being  initially  empty)  could  lead  the  analyst  to 
recommend  that  fewer  machines  than  originally  planned  be  installed. 

The  use  of  fewer  machines  could  then  result  in  severe  system 
bottlenecks  as  production  increases. 

Visually  detecting  initialization  effects  in  estimators  of  system 
performance  can  sometimes  be  a  non-trivial  task.  Figure  1  is  the 
output  (customer  waiting  times)  of  a  simulated  queueing  system  which 
was  initially  started  with  no  customers  in  the  system.  The  initial 
portion  of  the  run  does  not  appear  to  differ  radically  from  the  rest 
of  the  run.  Actually,  the  average  waiting  time  for  the  first  50 
customers  is  10  minutes  while  that  for  the  first  500  customers  is  27 
minutes!  The  relatively  low  observed  waiting  times  for  early 
customers  are  due  in  part  to  the  empty  and  idle  initialization  of  the 
system. 

Me  continue  with  the  above  queueing  example.  Suppose  that  the 
system  closes  each  night,  even  if  there  are  still  customers  waiting  in 
line  for  service.  If  some  of  these  unserved  customers  decide  to 
return  when  the  system  re-opens  the  next  morning,  there  will  be  a 
backlogging  of  demand  from  day  to  day.  In  this  case,  each  day  is  not 
independent;  starting  the  simulation  with  the  same  initial  condition 
of  no  demand  is  not  appropriate.  The  simulation  should  have  each  day 


per  for  mane  rt  o(  systems  ifo  to  retain  the  current  system.  1h>&  favo  rs 
the  status  quo  over  potential  new  systems  and  policies  that  may  in 
tact  be  superior. 


Unf ortunatel y ,  many  simulation  initialization  errors  are  not  as 
transparent  as  in  the  above  examples.  Problems  of  bias  (e.g., 
underestimated  machine  utilizations)  and  spurious  secondary 
correlations  in  output  measures  due  to  correlations  with  the  initial 
conditions  (e.g. ,  congestion  correlated  with  production)  can  be 
somewhat  difficult  to  recognize. 

Of  course,  there  are  situations  where  initialization  effects  are 
not  of  major  concern  -  for  instance,  models  in  which  the  random 
variables  corresponding  to  the  simulation  output  are  approximately 
independent  and  identically  distributed.  However,  in  many  models, 
careful  examination  will  indicate  that  such  assumptions  necessary  to 
ignore  initialization  effects  are  not  easily  supported. 

3.  Mathematical  effects  of  initialization 

3. 1  Estimator  bias 

We  examine  the  first-order  autoregressive  process  £AR(1>3, 

»  u  +  +  et,  t  =  0,1,2,..., 

where  lal  <  1,  Ce^l  is  a  sequence  of  independent  identically 

2 

distributed  normal  random  variables  with  mean  0  and  variance  1— a  ,  ant 

u  s  lim.  EC Y. 3  is  the  so-called  steady  state  process  mean.  CThe 
tws  t 

AR  ( 1 )  process  is  a  simple  model  which  is  useful  for  illustrating  many 
of  the  issues  in  simulation  initialization.  The  lag  j  serial 
correlation  for  this  process  is  Corr(Y.,Y.  .)  *  a*^*.3 

l  i  j 

Suppose  that  we  are  interested  in  estimating  u  and  that  the 


initial  observation  Y^  t  0.  Note  that  EtY^l 


u  ( 1  •  o  >  4  oE  t  Y 


So 


an  elementary  calculation  yields  ECY^D  ■  jid-oS;  the  first  tew  Y^’s 
are  quite  biased  as  estimators  ofp. 


3.2  Estimator  variance 

The  initial  conditions  also  have  an  effect  on  the  variance  of 

estimators.  Consider  the  familiar  M/h/1  queue  with  traffic  intensity 

less  than  1.0.  Let  fY^:  i*=0, 1 ,2,. .  .  >  denote  the  customer  waiting  time 

process.  Suppose  again  that  Y^  s  0.  It  is  noteworthy  that  the 

variance  of  early  Y.'s  is  actually  smaller  than  the  variance  of  later 

i 

Yi 's.  I.e.,  Var(Ys|Y^=0)  <  Var  (Y^  I  Y^O)  ,  where  s  is  'small*  and  t  is 

'large. '  This  phenomenon  is  illustrated  in  Figure  2,  where  we  run  a 
number  of  replications  of  an  h/M/1  queue,  each  of  which  is  started 
empty  (only  a  pseudo-random  number  seed  is  changed.)  Early  in  each 
run,  the  simulated  system  has  not  had  time  tD  change  substantially 
from  its  initial  empty  state. 


3.3  Mean  squared  error  of  estimator 

In  point  estimation,  accuracy  (measured  by  bias)  and  precision 
(measured  by  variance)  have  been  considered  important  performance 
criteria.  There  is  usually  a  trade-off  between  estimator  bias  and 
variance;  decreasing  one  often  means  that  the  other  will  increase. 

The  mean  squared  error  (mse)  of  a  point  estimator  is  the  squared  bias 
of  the  estimator  plus  its  variance;  the  mse  is  a  popular  (but  more  or 
less  arbitrary)  criterion  for  combining  the  bias  and  variance  into  a 
single  quantity  Csee,  for  example,  Blomqvist  (1970),  Fishman  (1972), 
Law  (19B3),  and  Wilson  and  Pritsker  (197Ba)3. 


waiting  times  Yj 


Figure  2:  Some  realizations  of  an  M/M/1  a.ueue  waiting  time  process. 

Notice  that  V(Yg|  YQ=0)  <  V(Yt  j  YQ=0)  when  s  is  ‘small’  and 

t  is  'large.' 


F  i  shman  (197?)  studies  the  effects  of  initial  conditions  on  the 
mse  of  the  random  variable  correspond! ng  to  thr  sample  mean  (as  an 
estimator  of  u>  for  the  AR(1)  process  described  above.  He 
demonstrates  that  the  common  practice  of  letting  the  simulation  warm 
up  before  beginning  the  collection  of  data  (known  as  output 
truncation)  is  not  necessarily  advisable  (from  a  mse  point  of  view). 

Say  that  we  have  two  independent  output  series  from  an  AR(1) 
process  and  that  one  series  has  been  truncated  while  the  other  has  not 
been  truncated.  Snell  and  Schruben  (1979)  show  that  under  certain 
conditions,  the  non-truncated  series  yields  a  mean  estimator  with  a 
loner  mse  than  that  from  the  truncated  series.  Several  points  in 
their  paper  are  worth  mentioning.  First,  consider  the  space  formed  by 
the  possible  values  for  a,  Var (i^) ,  YQ  (the  initial  observation),  and 
certain  other  factors.  If  the  run  is  rather  long,  there  will  be  a 
large  region  of  this  space  where  no  truncation  is  called  for. 

However,  no  matter  how  long  the  series  is  run,  there  are  initial 
conditions  which  are  so  atypical  that  truncation  will  reduce  the  mse. 
The  benefit  is  reduced  as  the  series  becomes  longer,  but  truncation  is 
still  of  value  in  some  cases.  This  contradicts  the  notion  that 
truncation  is  not  beneficial  if  a  simulation  is  run  a  very  long  time 
[see  Madanski  (1976)3.  It  can  be  less  costly  to  truncate  some  initial 
data  than  to  overwhelm  poor  initial  conditions  with  a  long  run. 

The  second  point  is  that,  from  a  mse  perspective,  positive  and 
negative  serial  correlations  have  much  the  same  effect.  The  region 
where  no  truncation  is  optimal  turns  out  to  be  larger  for  positive  a 
than  for  negative  o  (because  for  negative  o  the  sign  of  the  serial 
correlation  alternates  with  lag).  For  the  same  magnitude  of  o,  the 
amount  of  positive  correlation  in  the  series  is  greater  when  this 


coefficient  is  positive  than  the  amount  of  negative  correlation  when 
this  coefficient  is  negative. 

Snell  and  Schruben  also  derive  optimal  weightings  o4  mean 
estimators  4or  the  AR  ( 1 )  process.  Least  squares  estimators  (both 
ordinary  and  weighted)  o4  the  process  mean  are  presented.  Data 
truncation  can  be  viewed  as  a  special  case  o4  weighting  the  output 
from  a  simulation.  The  weight  on  an  observation  is  zero  if  it  is 
truncated  and  one  if  it  is  retained  for  analysis.  Expressions  are 
given  for  finding  optimal  truncation  points  for  several  different 
estimator  performance  criteria  (in  particular,  for  minimizing  the 
mse. ) 

3.4  Confidence  interval  estimators 

3.4.1  Batched  means  and  independent  replications 

Suppose  we  wish  to  calculate  confidence  intervals  for  the  steady 

state  mean  u  =  lim^^ECY^l  of  some  process  Many  experimenters 

use  the  method  of  batched  meanst  Divide  the  process  Yj,...,Yn  into  b 

contiguous  batches  of  m  Y^ ’ s  each  (assume  for  convenience  that  m 

divides  n).  Hence,  Y,.  ...,Y,  constitute  the  j-th 

’  (j-l)m+l*  (j-l>m+2’  jm 

batch,  j=l,...,b.  Define  the  batched  means: 


Y.  =  -  5T  ,  Y,.  ..  .,  j  =  l,...,b. 

j  m  tt=l  ( j-1 ) m+t  *  J 

When  applying  the  method  of  batched  means,  the  Y^'s  need  to  be 
approximately  iid  normal  random  variables  with  unknown  mean  and 
variance.  Approximate  100(l-a)7.  confidence  intervals  for  u  are  then 


given  by: 


b-  1  , 1 -a/2 


m  1  r«^ 

where  Y  c  —  V  .  Y_  is  the  random  variable  corresponding  to  the 
n  n  t-t»  1  t 

grand  sample  mean,  t.  is  the  upper  tj  quantile  of  the  t(k) 

r  i^l 

distribution,  and  5“"  e  V*3  (Y  -Y  )^/(b-l). 

tj=l  j  n 

Even  if  the  batched  means  are,  in  -fact,  approximately  independent 
and  normally  distributed,  initialization  bias  can  still  cause  a  number 
of  problems  with  the  above  batched  means  confidence  interval 
estimator.  For  instance,  Y^  might  be  quite  biased  as  an  estimator  of 
u;  this  would  result  in  a  confidence  interval  estimator  which  is 
incorrectly  centered.  Further,  S  might  be  a  biased  estimator  for  the 
underlying  process  variance, 

a  1 im  nVar (Y  >  . 

n-*a>  n 

r> 

If  ECS*"!  >>  o  ,  then  confidence  intervals  which  are  too  wide  to  be  of 

2 

practical  use  could  result.  Alternatively,  if  ECS  D  <<  ,  the 

resulting  intervals  might  be  invalid  Ci.e.,  the  probability  that  they 
cover  u  would  be  <<  1-al. 

The  method  of  replications  may  also  be  unacceptable  for  use  in 
the  presence  of  initialization  bias.  This  method  samples  b 
independent  streams  (of  m  observations  each)  from  the  stochastic 
process.  The  sample  means  from  each  of  the  b  replications  are 
calculated,  and  confidence  intervals  are  formed  via  (•*)  . 
Initialization  bias  is  easily  seen  to  be  more  of  a  problem  here  than 
in  the  method  of  batched  means  -  this  is  because  each  of  the  b 
replication  streams  can  be  biased  by  the  initial  transient  (whereas 
only  the  single,  long  stream  of  the  batched  means  method  can  have  any 
bias).  In  view  of  this  argument.  Law  and  Kelton  (19B2)  recommend  the 
use  of  batched  means  over  independent  replications. 


The  regenerative  method  of  confidence  interval  formulation  £cf. 
Crane  and  Iglehart  (1975)  and  Crane  and  Lemoine  (1977)3  is  structured 


in  such  a  way  that  it  is  not  directly  affected  by  initialization 
bias.  Regeneration  relies  on  the  fact  that  many  (covariance 
stationary)  stochastic  processes  <Y^,  t=0, 1 ,2, . . . >  'start  from 
scratch'  probabilistically  at  so-called  (random)  regeneration  times 
0  <  tj  <  t2  <  ...  The  cycles  {Yt:  tk  <  t  <  tk+1>,  k  >  1,  are 
independent  and  identically  distributed. 

For  k  .>  1,  define  the  k— th  cycle  length  Ak  s  tk+j  —  tk»  and  the 
random  variable  corresponding  to  sum  of  the  observations  from  the  k— th 
cycle. 


x,  e  ik;ri  y  . 

k  fai=t,  i 

k 


Clearly,  the  A  's  and  X  ' 
K  K 


s  are  iid. 


The  point  estimator  for  uslim.  Y.  is  u  e  —  T7  .  Y. ,  where  n 

t-»®  t  n  t*i  =  l  l  T 


is  the  total  number  of  observations  taken  (starting  at  time  t ^  =  1 , 


say).  Suppose  for  convenience  that  we  terminate  the  simulation  after 
exactly  N  regeneration  cycles;  so  n  =  2-i-l  '  Then  is  easY  to 

A 

see  that  m  =  where  we  use  the  obvious  notation. 

Now  define  ZR  s  Xk  -  pAk,  1  <  k  <  N;  so  EEZkD  =  0  and  Var(Zk)  = 
Var(Xk)  +  u2Var(Ak)  -  2*iCov  ( X  k ,  Ak> .  A  point  estimator  for  Var(Zk)  is 


given  by: 


-11- 

•z  '  s;  *  “Zsfi  -  -uBx«  •  Mhpr' 

EX  *  t-l'\  -  5n>2/<N-1),  S*  s  ^.,<Ak  -  V2/«N-1>.  .nd 

B e  ^=1  <Xk  -  XN>  -  An>/<N-1>.  It  can  then  be  shown  that 
approximate  100(l-a)7.  confidence  intervals  for  u  are  given  by: 

Pr{  u  e  u  i  *l  a/2n/AN>  C  <B^  +  »2S2  -  2uSxft)/N31/2  )  s  1  -  a,  where 

z,  ..  is  the  upper  l-a/2  quantile  of  the  Nor (0,1)  distribution. 

1  *•0/2 

A  _  _ 

Note  that  Efu3  =  ECXkl/Akl3  *  ECX.3/EtA.3  «=  u  -for  finite  N; 

N  N  1  1 


Crane  and  Lemoine  give  confidence  intervals  which  use  jackknifing  Ccf. 
Mi_l.ler  (1974)3  in  order  reduce  the  effects  of  this  bias. 

Given  that  the  process  will  eventually  return  to  its  initial 

state,  the  method  of  regeneration  is  particularly  appealing;  in  this 
case,  the  initial  state  might  serve  as  a  logical  choice  for 
regeneration  points.  However,  perhaps  the  main  drawback  of  this 
method  is  that  the  regeneration  cycle  lengths  might  be  prohibitively 
long  (especially  if  the  system  is  initialized  in  a  very  atypical 
state).  The  reader  should  peruse,  e.g.,  Bratley,  Fox,  and  Schrage 
(1983)  for  a  thorough  list  of  advantages  and  disadvatages  of  the 
regeneration  method.  See  Meketon  and  Heidelberger  (1982)  for  an 
interesting  discussion  pertaining  in  part  to  i ni t i al izati on  effects. 


3.4.3  Measures  of  performance 


In  the  recent  simulation  literature,  confidence  interval 


t: 


•  i?- 

estimator  prr  <on«ance  has  been  measured  using  the  following  criteria 
Isee,  e.g.,  Schmeiser  (1982)3: 

1)  observed  interval  coverage  frequency  (of  the  true  population 
mean,  say)  for  a  sample  of  confidence  intervals  that  use  a 
certain  initialization  bias  control  procedure. 

2)  interval  width,  typically  measured  by  the  sample  average 
half-width  of  the  observed  confidence  intervals. 

3)  interval  stability,  usually  measured  by  the  sample  variance 
of  the  observed  interval  half -widths. 

Some  authors  also  use  the  coefficient  of  variation  (c.v.)  of  the 
interval  half-width  as  another  criterion.  This  can  be  misleading: 
the  c.v.  might  be  reduced  (an  apparent  improvement)  by  merely 
increasing  the  interval  width  (not  an  improvement). 

As  alluded  to  previously,  initialization  bias  might  result  in 
poor  confidence  interval  performance  in  terms  of  all  of  the  above 
criteria.  The  effect  of  bias  on  interval  coverage  can  be  seen  using 
the  coverage  function  described  in  Schruben  (1980):  Suppose  a 
particular  confidence  interval  estimator  is  under  study,  and  we  obtain 
a  (large)  number  of  independent  realizations  of  that  estimator.  The 
observed  confidence  level  is  defined  to  be  that  percent  of  the 
realizations  which  cover  the  true  parameter  of  interest.  The  coverage 
function  is  simply  a  plot  of  the  desired  confidence  level  versus  the 
observed  confidence  level.  For  instance.  Figure  3  [taken  from 
Schruben  (1980)3  shows  how  initialization  bias  might  affect  the 
validity  of  confidence  intervals  formed  using  the  method  of 
replications  -  observed  coverage  levels  are  generally  below  the 
desired  coverage  levels. 


The*  trade-off  between  interval  halt-width  and  coverage  tor 
various  confidence  interval  estimators  can  be  studied  using  a 
graphical  technique  developed  in  Kang  and  Schmeiser  (1983).  They 
suggest  plotting  the  observed  confidence  interval  center  points 
against  the  observed  half-widths  from  a  sample  of  confidence  interval 
realizations.  The  observed  coverage  frequency  is  then  given  by  the 
percentage  of  such  points  which  fall  within  45°  of  the  vertical  line 
drawn  at  the  true  population  mean  p.  In  Figure  4a,  we  plot 
realizations  from  a  confidence  interval  procedure  inherently  yielding 
low  coverage  and  high  interval  half-widths;  such  a  plot  might  result 
from  the  presence  of  initialization  bias.  Figure  4b  illustrates  the 
analogous  plot  for  a  superior  procedure  (i.e. ,  a  procedure  which  has, 
among  other  things,  mollified  the  initialization  effects);  the 
coverage  is  close  to  the  desired  level  arvc*  the  half~widths  are 
somewhat  shorter  than  those  from  the  previous  diagram. 


Section  4.1  discusses  methods  for  ascertaining  whether  or  not 
(significant)  initialization  bias  exists.  In  Sections  4.2  and  4.3,  we 
give  suggestions  for  dealing  with  the  problems  caused  by 
initial ization. 


4.1  Detection  of  initialization  bias 

4.1.1  Graphical  methods 

Current  practice  seems  to  be  to  simply  look  at  plots  of  output  to 
try  to  visually  detect  any  initialization  effects.  This  can  be  very 
time  consuming  in  large  scale  simulation  experiments  involving  many 
runs,  and  visually  scanning  the  output  might  not  actually  be  helpful 
(recall  Figure  1  of  this  paper).  Smoothing  the  output  (by  using  a 
moving  average,  for  example)  as  suggested  in  Sargent  (1979)  or  Welch 
(1981)  makes  visual  analysis  easier.  Averaging  observations  across 
several  runs  (i.e. ,  the  first  observations  from  each  run  are  averaged, 
the  second  observations  are  averaged,  and  so  forth)  also  aids  in  the 
detection  of  initialization  effects  upon  the  mean  of  the  output 
process.  The  'CUSUM'  type  plots  in  Schruben  (1979)  are  particularly 
sensitive  to  changes  in  the  mean  of  a  stochastic  process;  these  plots 
can  be  most  useful  in  detecting  ini tial ization  effects. 

Transf ormation  of  the  data  (e.g.,  by  taking  the  logarithms  or  square 
roots  of  the  observations)  might  also  help  the  experimenter  in 
detection  of  an  initial  transient. 


Tests  for  initialization  b 


Kelton  (19B0)  proposes  an  intuitively  appealing  sequential  teat 
4 or  bias  in  which  the  simulation  output  regressed  on  simulated  clock 
time  is  tested  for  a  zero  slope.  - 

Other  statistical  tests  4or  initialization  effects  are  suggested 
in  Schruben  (19B2)  and  Schruben,  Singh,  and  Tierney  (19B3).  In  these 
tests,  a  quality  control  viewpoint  is  taken  (to  control  for 
inconsistent  'quality'  in  the  output  data).  Goldsman  (1984)  gives 
generalizations  of  these  tests. 

Goldsman  assumes  that  the  stochastic  process  Y  ,...,Y  can  be 

l  n 

modelled  as: 

Y.  e  ^  X.,  i  =  l , . . .  ,n, 

where  EtY.  3  “  p.,  i=l,...,n,  and  the  Xi 's  are  stationary.  If  Pj  = 

“  .  •  •  *  i*n ,  we  say  that  no  initialization  bias  is  present; 

otherwise,  bias  is  present.  [The  existence  of  initialization  bias  in 

higher  order  moments  is  studied  in,  e.g.,  Schruben  (1981). 3  The  idea 

behind  the  tests  is  to  first  partition  Yj,...,Yn  into  two  contiguous, 

non-overlapping  portions.  A  process  variance  estimate  calculated  from 

the  first  portion  of  a  realization  of  Y  ,...,Y  is  then  compared  to  an 

in 

estimate  from  the  second  portion.  If  it  is  determined  that  the  two 
estimates  are  'significantly  different',  the  null  hypothesis  of  no 
bias  is  rejected. 

As  an  example,  we  construct  some  simple  tests  for  bias.  Suppose 

that  the  first  portion  of  Y,  , ...,Y  is  to  be  divided  into  b'  batches 

I  n 

of  m  observations  each,  while  the  second  portion  is  to  be  divided  into 
b-b '  batches.  Define: 

y  -  i  yj  y 
i,j  j  £k*l  (i-Dm+k* 

the  average  of  the  first  j  Y's  from  batch  i,  i  =  l,...,b,  j=l,...,m. 
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Undrr  mild  conditions  t see  Goldsman  (1984)3,  it  can  be  shown  that  as  m 
increases. 


0  .  .  e  m  ^  -  [V  "  cr-Ib  ,  V  ] 

0,b  wi=l  1  i ,m  b  fcj»l  j,m  J 
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where  a  and  Y  are  defined  as  in  Section  3,  X  <v>  is  the  x' 

n 

distribution  with  v  degrees  of  freedom,  and  5  indicates 


convergence  in  distribution.  Also, 


°0,b-b-  E  m 


y>  r  v  .  Tb  v  l 

fc>i«b'  +  l*-  i  ,m  b-b  '  L  j«=b  '  +  1  j,m  ■« 
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S  e  x  (b-b  -1)  , 


1  <  b-b'  <  b-1. 


q  E  yb  .  [  ,  J<Y-  -  Y  >  l2  5  a2x2  (b ' )  ,  1  <  b'  <  b-1, 

1  ,b  3  4-1  =  1  t  Cj*=i  l  ,m  i,j  J  — 

m  — m 

Q*  .  s  Q,  .  -  Q,  .  .  5  a2X2(b-b),  1  <  b-b'  <  b-1. 
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Assuming  approximate  independence  of  the  batches,  we  have: 


b-b  -1  P0,b' 


0,b ' 


l,b 


b'-l  C 


.=  F<b  -1  ,b-b  -1)  and 


0,b-b ’ 


0. 


b-b'  1 ,b ' 


s  F  <b ' ,b-b ' ) , 
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where  F(Vj,vn)  is  the  F  distribution  with  and  v ^  degrees  of 
freedom,  respectively. 

If  we  were  to  use  F,  to  conduct  a  two-sided  test  for  bias,  we 

l  ,b 

would  reject  the  null  hypothesis  H^:  =  u,  yi ,  at  the  o  level  if 

F,  .  .  <  f.  .  .  .  .  or  >  f .  ,  .  .  ,  ,  where  f  _  is  the  upper 

l,b  b  ,b-b  ,a/2  b  ,b-b  ,l-a/2’  vlfV2*^ 


t|  quantile  of  the  Ffv^v^)  distribution. 


Caveats  (see  the  appropriate  references  for  details):  Kel ton's 
test  has  some  (minor)  problems  concerning  its  applicability. 


•>.V.  . 

•  ■  -  •  •  •  ■ 


V  V 

w-.v 


,V. 


•  *  •  *  »•  a  *  i  *  •  *  »  *  «  '  a  *  •  *  »  *  *  *  *  *  »  *  *  '  *  '  ^  “  »  * 


*  -  *  •.  *  . 


Goldsman's  lrt.tr  are  asymptotic  and  therefore  should  not  be  used  when 
the  experimenter  has  only  limited  data  on  hand. 

4.2  Estimation  of  steady  state  parameters 

The  goal  tor  now  is  to  find  an  initialization  bias  control 
procedure  which  yields  'good*  point  estimates  and/or  confidence 
intervals  for  the  steady  state  parameter (s>  of  interest.  If  the 
simulation  run  is  very  short,  good  estimation  may  not  be  possible, 
regardless  of  the  procedure  which  is  used.  (Until  recently, 
procedures  for  dealing  with  this  problem  have  primarily  been 
heuristic.)  In  general,  one  tries  to  obtain  as  good  an  estimator  as 
possible  recognizing  that  without  data  from  the  (asymptotic) 
population  of  interest,  compromise  will  be  necessary. 

_  4.2.1  Truncation  rules 

There  are  various  approaches  to  the  problem  of  point/conf idence 
interval  estimation.  By  far  the  most  attention  has  been  given  to 
weighting  of  the  data.  Recall  that  data  truncation  is  a  special  case 
of  weighting.  Truncation  allows  the  simulation  to  warm  up  before  data 
are  retained  for  analysis.  Many  'truncation  rules'  have  been  proposed 
and  studied  [see,  for  example,  Gafarian,  et  al.  (1978),  Kelton  (1980), 
Morisaku  (1976),  Sargent  (1979),  and  Wilson  and  Pritsker 
(1978a, 1978b) 3;  Pritsker  and  Pegden  (1979)  cite  the  following  rules 
(among  others): 

(a)  t Conway  (1963)3  Let  y^,...  ,y^  be  a  realization  of  the 
stochastic  process  under  study.  Define  the  truncation  point  t  as 
the  number  of  observations  to  be  deleted  from  the  beginning  of  the 


real  j  srat  1  on.  Choose  (the  smallest  possible)  t  Buch  that  y^.4j  H 
neither  the  minimum  nor  the  maximum  of  j ,. . . *Vn* 

(b)  [Gordon  (1969)3  Fcun  k  independent  replications  of  length  n  in 

order  to  estimate  Var ( V  ).  Choose  t  equal  to  the  value  of  n  for 

n 

which  the  estimate  of  Var(Y  )  begins  to  drop  off  as  1/n. 

n 

(c)  [Fishman  (1973)3  Choose  t  as  the  smallest  n  such  that  the 

realization  [y.  >  crosses  its  sample  mean  y  k  times  (k  must 
'l  n 

be  user-speci f i ed )  . 


(d)  [Schriber  (1974)3  Consider  the  k  most  recent  batches  of  size  m: 

yn-im+l *^n-iffl+2’*  "  ’yn-(i-l)m*  *  l,...»k. 

Define  the  k  correspondi ng  batched  means  as: 


.  .  .  1  rn-  (l-l )  m 

y ( j ,n)  =  —  > . 

m  6j=n-x 


m+1 


i  —  1  f  .  m  t  ykt 


Choose  t  as  the  smallest  n  such  that  all  of  the  y(i,n)’s  are 
within  an  interval  of  length  e  (k,  m,  and  e  must  be  specified  by 
the  user ) . 


(e)  CGafarian,  et  al.  (1977)3  Choose  (the  smallest  possible)  t 
such  that  y^  is  neither  the  minimum  nor  maximum  of  yj,...,y^. 

The  consensus  is  that  simple  truncation  rules  may  not  in  general 
perform  well  in  all  situations.  It  is  not  too  surprising  that  such 


rules  do  not  produce  good  confidence  interval  coverage  for  some  of  the 
systems  tested  in  the  literature.  This  observation  is  particularly 


trur  for  many  qurunng  systems  (the  most  popular  models  tor  testing 
simulation  procedures).  These  systems  only  asymptotically  reach 
steady  state  and  can  have  persistent  serial  dependence;  none  of  the 
data  are  from  the  (steady  state)  population  that  is  being  studied.  Of 
interest  here  would  be  the  lively  but  inconclusive  discussions  of  Fox 
(1978),  Kleijnen  (1979),  and  Schruben  (1978). 

4.2.2  Sequential  estimation 

A  strai ghtf orward  approach  for  controlling  simulation 
initialization  effects  is  to  sequentially  truncate  various  amounts  of 
the  initial  data  and  see  if  this  changes  the  decisions  indicated  by 
the  full  output  data  set.  A  sequential  procedure  similar  to  that 
suggested  in  Kelton  (1980)  might  be  adapted  effectively  in  this 
context. 

In  Heidelberger  and  Welch  (19B3),  an  automatic  test  for 
initialization  bias  based  on  Schruben  (1982)  was  included  in  a 
confidence  interval  procedure.  This  resulted  in  improved  performance 
over  an  earlier  version  of  the  procedure  that  did  not  have  any 
initialization  error  control  Usee  Heidelberger  and  Welch  (1981)3. 

Such  automatic  procedures  and  tests  for  initialization  bias  are 
especially  useful  when  there  is  a  large  amount  of  data  to  analyze  (in 
which  case  the  analyst  does  not  have  the  time  to  plot  and  look  at  all 
of  the  data).  Of  course,  the  condensation  of  information  contained  in 
data  (so  that  it  can  be  understood  more  easily)  is  one  of  the  purposes 
of  statistics.  The  better  the  information  is  condensed,  the  closer 


the  decisions  based  on  sample  statistics  will  be  to  those  based  on  the 


-  ?(»- 


4 , r , ?  Pi f  ret  model ling 

A  totally  different  approach  to  the  estimation  problem  is  to 
directly  model  the  transient  mean  function.  Snell  (19B0)  uses 
economic  growth  models  for  this  purpose.  Narasimhan,  et  ml.  (19BD 
works  with  so-called  ‘intervention  time  series'  'iodels.  Richards 
(1983)  uses  'relaxed  time  series'  modelling.  More  experience  is 
needed  with  these  promising  methods  before  their  merits  can  be  Judged. 

4.3  Experimental  design 

Another  method  for  dealing  with  simulation  initialization  is  to 
treat  its  effects  as  a  nuisance  factor  in  the  design  of  simulation 
experiments.  For  example,  the  various  initial  values  might  be 
considered  as  a  factor  in  an  ANOVA  analysis.  Several  possible  initial 
states  could  then  be  run  for  each  set  of  experimental  factors.  Dne 
might  block  such  experiments  on  the  initial  conditions  or  regress  the 
output  on  the  initial  conditions  to  try  to  control  this  source  of 
error.  Consider  the  beneficial  effects  in  terms  of  variance  reduction 
that  might  result  from  blocking  on  the  initial  conditions:  Estimators 
based  on  the  outputs  from  runs  with  the  same  initial  conditions  could 
be  expected  to  be  positively  correlated.  Further,  estimators  from 
runs  with  radically  different  initial  conditions  might  be  expected  to 
have  negative  correlations.  See  Schruben  and  Margolin  (1978)  and 
Schruben  (1979)  for  a  discussion  of  the  relationships  among  blocking, 
correlation  of  the  output  processes,  and  variance  reduction. 

The  experimental  design  approach  probably  deserves  more  attention 
than  it  has  received  although  it  may  be  of  limited  practical  value  in 
simulation  models  which  have  a  large  number  of  variables  to  be 


initialized. 


5.  Jnrw  )  u*  i  nn* 


Much  progress  has  been  made  in  dealing  with  the  difficult  and 
important  problem  of  initialization  bias.  Indeed,  simple  truncation 
rules  are  being  replaced  by  sequential  procedures  such  as  those  given 
in  Kelton  (19B0)  and  Heidelberger  and  Welch  (1983).  Various  tests  to 
detect  the  presence  of  i ni t i al i z at i on  bias  are  also  being  developed. 

We  feel  that  more  attention  should  be  given  to  the  problem  in  the 
decision  making  and  experimental  design  contexts.  Simulation 
researchers  might  also  study  the  di  ■f-ferent  issues  that  are  involved  in 
simulation  experiments  with  a  single  system  (or  single  performance 
measure)  vs.  experiments  with  multiple  systems  (or  multiple 
performance  measures).  Finally,  simulation  initialization  errors  may 
have  different  effects  depending  on  whether  the  simulations  are  used 
for  design,  optimization,  evaluation,  selection,  or  feasibility 
decisions;  some  investigation  in  this  area  is  warranted. 
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